Economic Hierarchical Q-Learning

Authors

  • Erik G. Schultink
  • Ruggiero Cavallo
  • David C. Parkes
Abstract

Hierarchical state decompositions address the curse of dimensionality in Q-learning methods for reinforcement learning (RL) but can suffer from suboptimality. To address this, we introduce the Economic Hierarchical Q-Learning (EHQ) algorithm for hierarchical RL. The EHQ algorithm uses subsidies to align interests, so that agents that would otherwise converge to a recursively optimal policy are instead motivated to act hierarchically optimally. The essential idea is that a parent pays a child for the relative value to the rest of the system of “returning the world” in one state rather than another. The resulting learning framework is simple compared to other algorithms that obtain hierarchical optimality. Additionally, EHQ encapsulates, at each node, the relevant information about value tradeoffs faced across the hierarchy, and it requires minimal data exchange between nodes. We provide no theoretical proof of hierarchical optimality but are able to demonstrate success with EHQ in empirical results.
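The subsidy idea in the abstract can be illustrated with a toy sketch. This is not the EHQ algorithm from the paper: the two-exit subtask, the `parent_value` table, the choice of `exit_A` as the reference state, and the functions `q_update`, `exit_subsidy`, and `train_child` are all hypothetical names and assumptions introduced for illustration. The sketch only shows how adding a parent-paid subsidy to a child's terminal reward can move the child's greedy choice away from the locally best exit.

```python
from collections import defaultdict


def q_update(Q, state, action, reward, next_state, actions,
             alpha=0.1, gamma=0.9, terminal=False):
    """Apply one tabular Q-learning update and return the new Q-value."""
    best_next = 0.0 if terminal else max(Q[(next_state, a)] for a in actions)
    target = reward + gamma * best_next
    Q[(state, action)] += alpha * (target - Q[(state, action)])
    return Q[(state, action)]


def exit_subsidy(parent_value, exit_state, reference_state):
    """Hypothetical subsidy: the parent's value for exit_state
    relative to a reference exit state."""
    return parent_value[exit_state] - parent_value[reference_state]


def train_child(parent_value, use_subsidy, episodes=200):
    """Train a one-step child subtask and return its greedy action."""
    Q = defaultdict(float)
    actions = ["left", "right"]
    # Assumed toy dynamics: each action ends the subtask immediately
    # in a fixed exit state with a fixed local reward.
    outcomes = {"left": ("exit_A", 1.0), "right": ("exit_B", 0.5)}
    for episode in range(episodes):
        action = actions[episode % 2]  # alternate so both actions are tried
        exit_state, reward = outcomes[action]
        if use_subsidy:
            reward += exit_subsidy(parent_value, exit_state, "exit_A")
        q_update(Q, "start", action, reward, exit_state, actions,
                 terminal=True)
    return max(actions, key=lambda a: Q[("start", a)])


# Assumed parent valuation: the rest of the system does much better
# when the child returns the world in exit_B.
parent_value = {"exit_A": 0.0, "exit_B": 5.0}
choice_without = train_child(parent_value, use_subsidy=False)
choice_with = train_child(parent_value, use_subsidy=True)
```

In this toy setting the unsubsidized child prefers `left`, which is best for the subtask in isolation, while the subsidized child prefers `right`, which is better for the system as a whole under the assumed parent valuation.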


Similar resources

Hierarchical Functional Concepts for Knowledge Transfer among Reinforcement Learning Agents

This article introduces the notions of functional space and concept as a means of knowledge representation and abstraction for Reinforcement Learning agents. These definitions are used as a tool for knowledge transfer among agents. The agents are assumed to be heterogeneous; they have different state spaces but share the same dynamics, reward, and action space. In other words, the agents are assumed t...


The MAXQ Method for Hierarchical Reinforcement Learning

This paper presents a new approach to hierarchical reinforcement learning based on the MAXQ decomposition of the value function. The MAXQ decomposition has both a procedural semantics—as a subroutine hierarchy—and a declarative semantics—as a representation of the value function of a hierarchical policy. MAXQ unifies and extends previous work on hierarchical reinforcement learning by Singh, Kae...


Autonomous Navigation in Partially Observable Environments Using Hierarchical Q-Learning

A self-learning adaptive flight control design allows reliable and effective operation of flight vehicles in a complex environment. Reinforcement Learning provides a model-free, adaptive, and effective process for optimal control and navigation. This paper presents a new and systematic approach combining Q-learning and hierarchical reinforcement learning with additional connecting Q-value funct...


Hierarchical Reinforcement Learning on the Virtual Battlefield

This paper investigates the potential of flat and hierarchical reinforcement learning (HRL) for solving problems within strategy games. An HRL method, Max-Q, is applied to a unit transportation task modelled within a simplified, discrete real-time strategy game engine, and its performance is compared to that of flat Q-learning. It is shown that reinforcement learning approaches, and especially hier...


Emergent Hierarchical Control Structures: Learning Reactive/Hierarchical Relationships in Reinforcement Environments

The use of externally imposed hierarchical structures to reduce the complexity of learning control is common. However, it is acknowledged that learning the hierarchical structure itself is an important step towards more general (learning of many things as required) and less bounded (learning of a single thing as specified) learning. Presented in this paper is a reinforcement learning algorithm c...




Publication date: 2008